Semantic Query Optimization for Query Plans of Heterogeneous Multidatabase Systems

Authors

  • Chun-Nan Hsu
  • Craig A. Knoblock
Abstract

New applications of information systems, such as electronic commerce and healthcare information systems, need to integrate a large number of heterogeneous databases over computer networks. Answering a query in these applications usually involves selecting relevant information sources and generating a query plan to combine the data automatically. As significant progress has been made in source selection and plan generation, the critical issue has been shifting to query optimization. This paper presents a semantic query optimization (SQO) approach to optimizing query plans of heterogeneous multidatabase systems. This approach provides global optimization for query plans as well as local optimization for subqueries that retrieve data from individual database sources. An important feature of our local optimization algorithm is that we prove necessary and sufficient conditions to eliminate an unnecessary join in a conjunctive query of arbitrary join topology. This feature allows our optimizer to utilize more expressive relational rules to provide a wider range of possible optimizations than previous work in SQO. The local optimization algorithm also features a new data structure called AND-OR implication graphs to facilitate the search for optimal queries. These features allow the global optimization to effectively use semantic knowledge to reduce data transmission cost. We have implemented this approach into the pesto query plan optimizer as a part of the sims information mediator. Experimental results demonstrate that pesto can provide significant savings in query execution cost over query plan execution without optimization.
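As a rough illustration of the join elimination idea mentioned in the abstract, the sketch below drops a join from a conjunctive query when a semantic rule guarantees it cannot change the answer. It is not the paper's algorithm and does not implement its necessary and sufficient conditions or AND-OR implication graphs. It covers only one simplified special case, assuming set semantics, non-null join attributes, and rules given as inclusion constraints. The names `Query`, `eliminate_joins`, `ship`, and `port` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A conjunctive query (hypothetical representation for this sketch)."""
    relations: frozenset   # relation names
    joins: frozenset       # equality joins: ((rel, attr), (rel, attr))
    selections: frozenset  # constraints: (rel, attr, op, value)
    outputs: frozenset     # projected attributes: (rel, attr)


def eliminate_joins(query, inclusions):
    """Drop relations whose only role is a join that a rule makes redundant.

    `inclusions` is a set of ((child_rel, child_attr), (parent_rel, parent_attr))
    pairs, each asserting that every child value also appears in the parent.
    Assumes set semantics and non-null join attributes.
    """
    q = query
    changed = True
    while changed:
        changed = False
        for rel in sorted(q.relations):
            # The relation must not contribute outputs or selections.
            if any(r == rel for r, _ in q.outputs):
                continue
            if any(s[0] == rel for s in q.selections):
                continue
            # It must take part in exactly one join condition.
            touching = [j for j in q.joins if rel in (j[0][0], j[1][0])]
            if len(touching) != 1:
                continue
            a, b = touching[0]
            child, parent = (a, b) if b[0] == rel else (b, a)
            if child[0] == rel:
                continue  # self-join: not handled by this sketch
            if (child, parent) in inclusions:
                q = Query(q.relations - {rel}, q.joins - {touching[0]},
                          q.selections, q.outputs)
                changed = True
                break
    return q


# Hypothetical example: every ship.home_port value appears in port.name.
join = (("ship", "home_port"), ("port", "name"))
rules = {join}
query = Query(
    relations=frozenset({"ship", "port"}),
    joins=frozenset({join}),
    selections=frozenset({("ship", "length", ">", 100)}),
    outputs=frozenset({("ship", "name")}),
)
optimized = eliminate_joins(query, rules)
print(sorted(optimized.relations), len(optimized.joins))
```

In a multidatabase setting, dropping such a join can also mean that one source need not be contacted at all, which is one way semantic knowledge could reduce data transmission cost.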

Similar resources

Quality-driven Integration of Heterogenous Information Systems

Integrated access to information that is spread over multiple, distributed, and heterogeneous sources is an important problem in many scientific and commercial domains. While much work has been done on query processing and choosing plans under cost criteria, very little is known about the important problem of incorporating the information quality aspect into query planning. In this paper we desc...

Query Scrambling in Distributed Multidatabase Systems

This work addresses the problem of efficient query processing in multidatabase systems distributed over wide-area networks. The solution unifies the query scrambling and reduction approaches to dynamic optimization of query processing plans at the data integration stage. The paper presents a new data integration algorithm based on query scrambling and the extended reduction technique. The algori...

Learning Database Abstractions for Query Reformulation

The query reformulation approach (also called semantic query optimization) takes advantage of semantic knowledge about the contents of databases for optimization. The basic idea is to use the knowledge to reformulate a query into a less expensive yet equivalent query. Previous work on semantic query optimization has shown the cost reduction that can be achieved by reformulation; we further ...

Analysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)

Background and Aim: Information systems cannot be well designed or developed without a clear understanding of users' needs and of how they seek and evaluate information. This research was designed to analyze the query refinement behaviors of users of Ganj (Iranian research institute of science and technology database) via log analysis. Methods: The method of this research is log anal...

Query Decomposition, Optimization and Processing in Multidatabase Systems

One way of achieving interoperability among heterogeneous, federated DBMSs is through a multidatabase system that supports a single common data model and a single global query language on top of different types of existing systems. The global schema of a multidatabase system is the result of a schema integration of the schemas exported from the underlying databases, i.e., local databases. A glo...

Journal title:
  • IEEE Trans. Knowl. Data Eng.

Volume 12

Publication date 2000